SparseMatricesCSR Dispatch #2720


Open: wants to merge 6 commits into base: master

Abdelrahman912 (Contributor) commented Mar 27, 2025

This PR adds a new extension module SparseMatricesCSRExt that enables dispatching SparseMatrixCSR from SparseMatricesCSR.jl to CuSparseMatrixCSR.
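
For readers unfamiliar with the CSR (compressed sparse row) layout that both matrix types share, here is a minimal sketch of the three arrays a host-to-device conversion would need to carry over. It is plain Python for illustration only, not the PR's code and not the API of SparseMatricesCSR.jl or CUDA.jl; the helper names `dense_to_csr` and `csr_matvec` and the `index_base` parameter are hypothetical.

```python
def dense_to_csr(dense, index_base=1):
    """Build CSR arrays (rowptr, colval, nzval) from a dense row-major matrix.

    index_base=1 mimics one-based indexing; pass 0 for zero-based indexing.
    All names here are illustrative assumptions, not a library API.
    """
    rowptr = [index_base]  # rowptr[i] marks where row i starts in colval/nzval
    colval = []            # column index of each stored entry
    nzval = []             # value of each stored entry
    for row in dense:
        for j, v in enumerate(row):
            if v != 0:
                colval.append(j + index_base)
                nzval.append(v)
        # After finishing a row, record where the next row begins.
        rowptr.append(len(nzval) + index_base)
    return rowptr, colval, nzval


def csr_matvec(rowptr, colval, nzval, x, index_base=1):
    """Compute y = A * x directly from the CSR arrays."""
    y = []
    for i in range(len(rowptr) - 1):
        start = rowptr[i] - index_base
        stop = rowptr[i + 1] - index_base
        y.append(sum(nzval[k] * x[colval[k] - index_base]
                     for k in range(start, stop)))
    return y


dense = [
    [1, 0, 2],
    [0, 0, 3],
    [4, 5, 0],
]
rowptr, colval, nzval = dense_to_csr(dense)
product = csr_matvec(rowptr, colval, nzval, [1, 1, 1])
```

Because the whole matrix is described by these three flat arrays plus its dimensions, a CPU-to-GPU conversion can in principle be a copy of each array to device memory, provided both sides agree on the index base.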


codecov bot commented Mar 27, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 89.00%. Comparing base (af58f61) to head (627c794).

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2720      +/-   ##
==========================================
+ Coverage   88.87%   89.00%   +0.12%     
==========================================
  Files         153      153              
  Lines       13154    13154              
==========================================
+ Hits        11691    11708      +17     
+ Misses       1463     1446      -17     


maleadt requested review from amontoison and kshyatt March 27, 2025 17:29
Abdelrahman912 marked this pull request as ready for review March 28, 2025 00:03

github-actions bot (Contributor) commented Mar 28, 2025

Your PR no longer requires formatting changes. Thank you for your contribution!

maleadt requested a review from kshyatt March 29, 2025 13:48

github-actions bot (Contributor) left a comment

CUDA.jl Benchmarks

| Benchmark suite | Current: 627c794 | Previous: af58f61 | Ratio |
|---|---|---|---|
| latency/precompile | 47219964749 ns | 47364047578 ns | 1.00 |
| latency/ttfp | 6800010126 ns | 6821264934 ns | 1.00 |
| latency/import | 3212836642.5 ns | 3225196597.5 ns | 1.00 |
| integration/volumerhs | 9610171.5 ns | 9614354 ns | 1.00 |
| integration/byval/slices=1 | 146882 ns | 146613 ns | 1.00 |
| integration/byval/slices=3 | 425144 ns | 425074 ns | 1.00 |
| integration/byval/reference | 144782 ns | 144823 ns | 1.00 |
| integration/byval/slices=2 | 286155 ns | 286027 ns | 1.00 |
| integration/cudadevrt | 103242 ns | 103297 ns | 1.00 |
| kernel/indexing | 14023 ns | 14019 ns | 1.00 |
| kernel/indexing_checked | 14532 ns | 14585 ns | 1.00 |
| kernel/occupancy | 687.8533333333334 ns | 721.3333333333334 ns | 0.95 |
| kernel/launch | 2110 ns | 2085.8 ns | 1.01 |
| kernel/rand | 17528 ns | 18047 ns | 0.97 |
| array/reverse/1d | 19719 ns | 19477 ns | 1.01 |
| array/reverse/2d | 25105 ns | 24090.5 ns | 1.04 |
| array/reverse/1d_inplace | 11120 ns | 10504 ns | 1.06 |
| array/reverse/2d_inplace | 13011 ns | 12197 ns | 1.07 |
| array/copy | 21291 ns | 21209 ns | 1.00 |
| array/iteration/findall/int | 157272 ns | 158069.5 ns | 0.99 |
| array/iteration/findall/bool | 138618.5 ns | 139342 ns | 0.99 |
| array/iteration/findfirst/int | 153006.5 ns | 153958 ns | 0.99 |
| array/iteration/findfirst/bool | 154403 ns | 154465 ns | 1.00 |
| array/iteration/scalar | 70522 ns | 72075 ns | 0.98 |
| array/iteration/logical | 213281.5 ns | 215491 ns | 0.99 |
| array/iteration/findmin/1d | 41124 ns | 41679 ns | 0.99 |
| array/iteration/findmin/2d | 94274 ns | 94340 ns | 1.00 |
| array/reductions/reduce/1d | 35396 ns | 35947 ns | 0.98 |
| array/reductions/reduce/2d | 41051 ns | 41316 ns | 0.99 |
| array/reductions/mapreduce/1d | 33102 ns | 33483 ns | 0.99 |
| array/reductions/mapreduce/2d | 40882.5 ns | 40997 ns | 1.00 |
| array/broadcast | 20761.5 ns | 20733 ns | 1.00 |
| array/copyto!/gpu_to_gpu | 13749 ns | 13434 ns | 1.02 |
| array/copyto!/cpu_to_gpu | 207750 ns | 208335 ns | 1.00 |
| array/copyto!/gpu_to_cpu | 243108.5 ns | 243244 ns | 1.00 |
| array/accumulate/1d | 109745.5 ns | 109403 ns | 1.00 |
| array/accumulate/2d | 80219 ns | 80261 ns | 1.00 |
| array/construct | 1247.5 ns | 1244.9 ns | 1.00 |
| array/random/randn/Float32 | 44314 ns | 43677.5 ns | 1.01 |
| array/random/randn!/Float32 | 26326 ns | 26376 ns | 1.00 |
| array/random/rand!/Int64 | 27062 ns | 27073 ns | 1.00 |
| array/random/rand!/Float32 | 8598.333333333334 ns | 8572 ns | 1.00 |
| array/random/rand/Int64 | 33597 ns | 29918 ns | 1.12 |
| array/random/rand/Float32 | 12971 ns | 12871 ns | 1.01 |
| array/permutedims/4d | 61325 ns | 61023 ns | 1.00 |
| array/permutedims/2d | 55343 ns | 55334 ns | 1.00 |
| array/permutedims/3d | 55951 ns | 55987.5 ns | 1.00 |
| array/sorting/1d | 2775342.5 ns | 2774813 ns | 1.00 |
| array/sorting/by | 3367068 ns | 3365701 ns | 1.00 |
| array/sorting/2d | 1084272 ns | 1084786 ns | 1.00 |
| cuda/synchronization/stream/auto | 1043.1 ns | 1038.1 ns | 1.00 |
| cuda/synchronization/stream/nonblocking | 6521.2 ns | 6575 ns | 0.99 |
| cuda/synchronization/stream/blocking | 833.1764705882352 ns | 798.2788461538462 ns | 1.04 |
| cuda/synchronization/context/auto | 1189.9 ns | 1153 ns | 1.03 |
| cuda/synchronization/context/nonblocking | 6686.2 ns | 6748.6 ns | 0.99 |
| cuda/synchronization/context/blocking | 932.2051282051282 ns | 888.7142857142857 ns | 1.05 |

This comment was automatically generated by workflow using github-action-benchmark.

Abdelrahman912 (Contributor, Author) commented

This PR should be ready by now

2 participants